Realtime Viterbi Searching for Practical Telephone Speech Recognition Systems

Authors

  • Jin Zhang
  • Jia Liu
  • Run-sheng Liu
Abstract

This paper studies the search and pruning process of a telephone speech recognition system for a Private Automatic Branch Exchange (PABX), both to explore the problems that arise when speech recognition is applied to the telephone network and to prepare the techniques needed for practical telephone speech recognition systems. An experiment on a baseline system, which uses a semi-syllable-based multi-subtree decoding structure and a classical Viterbi beam search algorithm, achieves a keyword accuracy of 89.86%. With the dynamic threshold method, keyword accuracy reaches 93.48%. With the 'speed up jumping strategy', keyword accuracy rises further to 97.35%.
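
The abstract names a Viterbi beam search with a dynamic threshold but does not say how the threshold is adapted. The following is a minimal sketch of the general idea only, not the authors' method: it assumes a plain HMM given as log-probability tables, a fixed beam width `beam`, and a hypothetical cap `max_active` that tightens the threshold in frames where too many hypotheses survive. All function and parameter names are illustrative.

```python
import math

NEG_INF = float("-inf")


def prune(hyps, beam, max_active):
    """Keep hypotheses whose score is within `beam` of the frame's best.

    If `max_active` is set and more hypotheses than that survive, the
    threshold is raised to the max_active-th best score (ties may keep
    a few extra). This is the assumed "dynamic" part of the threshold.
    """
    if not hyps:
        return hyps
    scores = sorted((score for score, _ in hyps.values()), reverse=True)
    threshold = scores[0] - beam
    if max_active is not None and len(scores) > max_active:
        threshold = max(threshold, scores[max_active - 1])
    return {s: h for s, h in hyps.items() if h[0] >= threshold}


def viterbi_beam(log_init, log_trans, log_emit, obs, beam=10.0, max_active=None):
    """Log-domain Viterbi decoding with beam pruning after every frame.

    log_init[s]     : log P(state s at the first frame)
    log_trans[p][s] : log P(next state s | previous state p)
    log_emit[s][o]  : log P(observation o | state s)
    Returns (best_state_path, best_log_score).
    """
    n = len(log_init)
    # Each active hypothesis maps a state to (score, path ending in that state).
    active = {}
    for s in range(n):
        score = log_init[s] + log_emit[s][obs[0]]
        if score > NEG_INF:
            active[s] = (score, [s])
    active = prune(active, beam, max_active)

    for o in obs[1:]:
        nxt = {}
        # Only states that survived pruning are expanded.
        for p, (score, path) in active.items():
            for s in range(n):
                cand = score + log_trans[p][s] + log_emit[s][o]
                if cand == NEG_INF:
                    continue
                if s not in nxt or cand > nxt[s][0]:
                    nxt[s] = (cand, path + [s])
        active = prune(nxt, beam, max_active)

    if not active:
        raise ValueError("no hypothesis survived; the model forbids this input")
    score, path = max(active.values(), key=lambda h: h[0])
    return path, score
```

A narrow beam or a small `max_active` makes each frame cheaper but can discard the hypothesis that would have become the best path, which is the trade-off any threshold adaptation scheme has to manage.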

Similar resources

Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM

Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...

Lexical stress modeling for improved speech recognition of spontaneous telephone speech in the jupiter domain

This paper examines an approach of using lexical stress models to improve the speech recognition performance on spontaneous telephone speech. We analyzed the correlation of various pitch, energy, and duration measurements with lexical stress on a large corpus of spontaneous utterances, and identified the most informative features of stress using classification experiments. We incorporated the s...

VLSI Architecture of GMM Processing and Viterbi Decoder for 60,000-Word Real-Time Continuous Speech Recognition

We propose a low-memory-bandwidth, high-efficiency VLSI architecture for 60-k word real-time continuous speech recognition. Our architecture includes a cache architecture using the locality of speech recognition, beam pruning using a dynamic threshold, two-stage language model searching, a parallel Gaussian Mixture Model (GMM) architecture based on the mixture level and frame level, a parallel ...

A detection approach to search-space reduction for HMM state alignment in speaker verification

To support speaker verification (SV) in portable devices and in telephone servers with millions of users, a fast algorithm for hidden Markov model (HMM) alignment is necessary. Currently, the most popular algorithm is the Viterbi algorithm with beam search to reduce search-space; however, it is difficult to determine a suitable beam width beforehand. A small beam width may miss the optimal path...

Large vocabulary decoding and confidence estimation using word posterior probabilities

This paper investigates the estimation of word posterior probabilities based on word lattices and presents applications of these posteriors in a large vocabulary speech recognition system. A novel approach to integrating these word posterior probability distributions into a conventional Viterbi decoder is presented. The problem of the robust estimation of confidence scores from word posteriors ...

Publication date: 2002